Communication Optimization for Multi GPU Implementation of Smith-Waterman Algorithm

نویسندگان

  • Sampath Kumar
  • P. K. Baruah
چکیده

GPU parallelism for real applications can achieve enormous performance gain. CPU-GPU Communication is one of the major bottlenecks that limit this performance gain. Among several libraries developed so far to optimize this communication, DyManD (Dynamically Managed Data) provides better communication optimization strategies and achieves better performance on a single GPU. SmithWaterman is a well known algorithm in the field of computational biology for finding functional similarities in a protein database. CUDA implementation of this algorithm speeds up the process of sequence matching in the protein database. When input databases are large, multi-GPU implementation gives better performance than single GPU implementation. Since this algorithm acts upon large databases, there is need for optimizing CPU-GPU communication. DyManD implementation provides efficient data management and communication optimization only for single GPU. For providing communication optimization on multiple GPUs, an approach of combining DyManD with a multi-threaded framework called GPUWorker was proposed. Our contribution in this work is to propose an optimized CUDA implementation of this algorithm on multiple GPUs i.e., GPUWorker-DyManD which reduces the communication overhead between CPU and multiple GPUs. This implementation combines DyManD functionality with GPUWorker for optimizing communication. The performance gain obtained for the GPUWorker-DyManD implementation of this algorithm over default multi-GPU implementation is 3.5x.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Accelerating the Smith-Waterman Algorithm for Bio-sequence Matching on GPU

Nowadays, GPU has emerged as one promising computing platform to accelerate bio-sequence analysis applications by exploiting all kinds of parallel optimization strategies. In this paper, we take a well-known algorithm in the field of pair-wise sequence alignment and database searching, the Smith-Waterman (S-W) algorithm as an example, and demonstrate approaches that fully exploit its performanc...

متن کامل

An approach to Improve Particle Swarm Optimization Algorithm Using CUDA

The time consumption in solving computationally heavy problems has always been a concern for computer programmers. Due to simplicity of its implementation, the PSO (Particle Swarm Optimization) is a suitable meta-heuristic algorithm for solving computationally heavy problems. However, despite the simplicity, the algorithm is inefficient for solving real computationally heavy problems but the pr...

متن کامل

Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform

There are different variants of Particle Swarm Optimization (PSO) algorithm such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...

متن کامل

GPU-SW Sequence Alignment server

We present a complete sequence homology search server based on the hybrid CPU/GPU implementation of the Smith Waterman algorithm for sequence alignment. We discuss system architecture, division of the tasks between CPU and GPU in the hybrid design, the scalability issues and hardware requirements. The performance of the server is compared with the state-ofthe-art sequence analysis servers. Bioi...

متن کامل

GPU Accelerated Smith-Waterman

We present a novel hardware implementation of the double affine Smith-Waterman (DASW) algorithm, which uses dynamic programming to compare and align genomic sequences such as DNA and proteins. We implement DASW on a commodity graphics card, taking advantage of the general purpose programmability of the graphics processing unit to leverage its cheap parallel processing power. The results demonst...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013